I want to explore the state of possession and its effect on scoring in NBA basketball. The state of possession is defined as how an offensive possession begins. For example, if the opponent makes their shot from the previous play, the possession starts from inbounding the ball under the basket. If a shot is missed, it can be rebounded by either team, and the possession begins where the rebound is retrieved. If the ball is stolen, the possession starts at perhaps the most advantageous state, as the defense is not in position to defend. I believe such information is important, and can help uncover team and player strengths and weaknesses that are typically overlooked by analysts.

Univariate plots

possession start counts

Most possessions begin with the ball inbounded after a made shot. The second most likely scenario is a defensive rebound after a missed shot by the opponent.

possession result counts

##      res                 cnt        
##  Length:14          Min.   :   312  
##  Class :character   1st Qu.: 20172  
##  Mode  :character   Median :182304  
##                     Mean   :235213  
##                     3rd Qu.:278779  
##                     Max.   :917114

Missed and made 2 point field goals dominate how possessions end.

shot counts and points by distance

##       dist           pts              cnt        
##  Min.   : 0.0   Min.   :   856   Min.   :  1362  
##  1st Qu.: 7.5   1st Qu.: 33166   1st Qu.: 38894  
##  Median :15.0   Median : 43083   Median : 52650  
##  Mean   :15.0   Mean   : 74808   Mean   : 73804  
##  3rd Qu.:22.5   3rd Qu.: 74436   3rd Qu.: 88403  
##  Max.   :30.0   Max.   :476090   Max.   :367763

The first of the two graphs is the shot distribution by distance. It appears to be tri-modal, with local peaks around 0, 18, and 25 ft. The NBA 3 point line is 22 ft from the basket in the corners and 23 ft 9 in elsewhere, which explains the outermost peak.

The second of the two graphs is the shot efficacy by distance. It appears to be bimodal, with local peaks very close to the basket and at the 3 point line. In other words, shots at the basket and 3 point shots are the most efficient shots in the NBA.
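Efficacy here is read as points per attempt, i.e. the pts column divided by the cnt column for each distance. A minimal Python sketch of that aggregation follows. It assumes rows shaped like the (dist, pts, cnt) summary above; the function name and the toy rows are hypothetical and are not taken from the dataset.

```python
from collections import defaultdict

def points_per_attempt(rows):
    """Aggregate (dist, pts, cnt) rows into points per attempt by distance."""
    pts_total = defaultdict(float)
    cnt_total = defaultdict(int)
    for dist, pts, cnt in rows:
        pts_total[dist] += pts
        cnt_total[dist] += cnt
    # Skip distances with no attempts to avoid dividing by zero.
    return {d: pts_total[d] / cnt_total[d]
            for d in sorted(cnt_total) if cnt_total[d] > 0}

# Hypothetical toy rows, for illustration only.
toy_rows = [(0, 130, 100), (18, 80, 100), (24, 110, 100), (24, 55, 50)]
ppa = points_per_attempt(toy_rows)
```

Rows that share a distance (for example, one row per season) are pooled before dividing, so the result is weighted by attempts rather than being an average of averages.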

Bivariate plots

start efficacy

##      sta                 pts               cnt         
##  Length:15          Min.   :   1112   Min.   :   1486  
##  Class :character   1st Qu.:  23534   1st Qu.:  32549  
##  Mode  :character   Median :  57520   Median :  72832  
##                     Mean   : 185587   Mean   : 218217  
##                     3rd Qu.: 192752   3rd Qu.: 205939  
##                     Max.   :1036079   Max.   :1261515

The intuition that a steal (live to) leads to the highest propensity to score is confirmed. Offensive rebounds of missed field goals (live off fg reb) rank second. Another important observation is that it is more favorable to start a possession from a defensive rebound (live def fg reb) than from inbounding after a made shot (made shot). This matters because these are the two most frequent possession beginnings.

result efficacy

##      res                 pts               cnt        
##  Length:13          Min.   :      0   Min.   :   312  
##  Class :character   1st Qu.:      6   1st Qu.: 21533  
##  Mode  :character   Median :     13   Median :187479  
##                     Mean   : 214138   Mean   :251790  
##                     3rd Qu.:   2954   3rd Qu.:301716  
##                     Max.   :1757473   Max.   :917114

The deviations from whole numbers are due to made shots followed by an extra free throw (and 1), as well as missed free throws in general.
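The arithmetic behind these non-integer averages can be sketched in Python. The rates below are hypothetical values chosen only to illustrate the effect; they are not estimates from the data, and the function name is an assumption.

```python
def expected_points(base_points, and_one_rate, ft_pct):
    """Average points for a made field goal that sometimes earns one extra free throw."""
    return base_points + and_one_rate * ft_pct

# Hypothetical: 5% of made 2 point shots draw a foul, and 75% of free throws go in.
avg_made_two = expected_points(2, and_one_rate=0.05, ft_pct=0.75)

# Hypothetical: a two-shot free throw trip at the same 75% accuracy.
avg_ft_trip = 2 * 0.75
```

Under these assumed rates a made 2 point field goal averages slightly more than 2 points, while a two-shot trip to the line averages less than 2.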

shot counts by season

##      season          dist         pts             cnt       
##  Min.   :2005   Min.   : 0   Min.   :   30   Min.   :   66  
##  1st Qu.:2007   1st Qu.: 7   1st Qu.: 2756   1st Qu.: 3375  
##  Median :2010   Median :15   Median : 3942   Median : 4645  
##  Mean   :2010   Mean   :15   Mean   : 6801   Mean   : 6709  
##  3rd Qu.:2013   3rd Qu.:23   3rd Qu.: 7084   3rd Qu.: 7720  
##  Max.   :2015   Max.   :30   Max.   :68880   Max.   :54955

This is the shot count for each distance plotted over NBA seasons. The darker the gradient, the more recent. The interesting observation is the consistent decrease in mid-range shots taken. This change is echoed by an increase in 3 point shots taken.

shot efficacy by season

This is the same plot, but for efficacy. Short range shots of 9 ft and under appear to be losing their effectiveness.

ggpairs on state and shot types

The possession beginnings have been binned into four major categories denoted by S1 through S4. Shots also received the same treatment and are binned by distance.

  • S1: dead ball, mostly comprised of made shots
  • S2: rebound from opponent missed shot
  • S3: ball stolen
  • S4: rebound from own missed shot

  • close: 3 ft and under
  • short: 4 ft - 11 ft
  • mid: 2 point shots over 11 ft
  • 3pt: 3 point shots
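The two binnings above can be written as small lookup functions. This is a Python sketch under stated assumptions: the raw start labels are the ones quoted earlier in the text (made shot, live def fg reb, live to, live off fg reb), the mapping of any other start label is not given in this report and is left unmapped, and the function names are hypothetical.

```python
def shot_bin(dist_ft, is_three):
    """Bin a shot attempt by the distance cut-offs listed above."""
    if is_three:
        return "3pt"
    if dist_ft <= 3:
        return "close"
    if dist_ft <= 11:
        return "short"
    return "mid"  # 2 point shots over 11 ft

# Only the four start labels named in the text are mapped here (assumption).
STATE_BINS = {
    "made shot": "S1",        # dead ball
    "live def fg reb": "S2",  # rebound from opponent missed shot
    "live to": "S3",          # ball stolen
    "live off fg reb": "S4",  # rebound from own missed shot
}

def state_bin(start_label):
    """Return S1..S4 for a known start label, or None if it is not mapped."""
    return STATE_BINS.get(start_label)
```

For example, `shot_bin(18, False)` returns `"mid"` and `state_bin("live to")` returns `"S3"`.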

Most possessions start from a dead ball, followed by rebounds of opponent misses, as expected. Steals and offensive rebounds occur at about the same rate. Close shots and mid range shots are very frequent, followed by 3 point shots; short shots happen the least. The cross-variable plots are difficult to interpret, so other plots will be used.

Players and average points scored by shot type

##       sbin         actor                pts             cnt        
##  close  :1176   Length:4529        Min.   :    0   Min.   :   1.0  
##  short  :1147   Class :character   1st Qu.:   18   1st Qu.:  23.0  
##  mid    :1169   Mode  :character   Median :  123   Median : 142.0  
##  3pt    :1037                      Mean   :  512   Mean   : 505.2  
##  no shot:   0                      3rd Qu.:  583   3rd Qu.: 623.0  
##                                    Max.   :10896   Max.   :8971.0

Close shots are the best, and most players make them at a high rate. 3 point shots are the second best on average. However, there appears to be a fat lower tail, suggesting it’s only a good shot for those who can make it. The 25th to 75th percentile range for mid range shots is narrower than for other shots, suggesting most players are neither much better nor much worse than each other.

Players and average points scored by state

##     state              actor                pts               cnt         
##  Length:4637        Length:4637        Min.   :    0.0   Min.   :    1.0  
##  Class :character   Class :character   1st Qu.:   24.0   1st Qu.:   25.0  
##  Mode  :character   Mode  :character   Median :  134.0   Median :  130.0  
##                                        Mean   :  500.1   Mean   :  493.4  
##                                        3rd Qu.:  530.0   3rd Qu.:  493.0  
##                                        Max.   :11890.0   Max.   :11707.0

Live ball turnovers (steals) have the highest scoring potential, followed by offensive rebounds. Possessions beginning after a defensive rebound have noticeably higher scoring potential than those beginning from a dead ball.

Teams and attempts by shot type

##       sbin        off                 pts             cnt       
##  close  :30   Length:120         Min.   : 7740   Min.   : 9787  
##  short  :30   Class :character   1st Qu.:13583   1st Qu.:13935  
##  mid    :30   Mode  :character   Median :18585   Median :20615  
##  3pt    :30                      Mean   :19325   Mean   :19066  
##  no shot: 0                      3rd Qu.:24638   3rd Qu.:23720  
##                                  Max.   :37383   Max.   :29175

The differentiation of shot selection is more clear here. The Houston Rockets really dislike mid range shots. The Denver Nuggets are very good at getting close shots.

Teams and points per attempt by shot type

It’s interesting how little teams differ from each other in points from mid range shots. The mid range is where shots go to die, as it is almost never a team’s competitive edge. The exception is Dallas, and it’s immediately clear to an NBA fan why: his name is Dirk Nowitzki.

Teams and points per possession by state

##     state               off                 pts             cnt       
##  Length:120         Length:120         Min.   : 6761   Min.   : 5711  
##  Class :character   Class :character   1st Qu.: 8447   1st Qu.: 7257  
##  Mode  :character   Mode  :character   Median :12708   Median :12313  
##                                        Mean   :19325   Mean   :19066  
##                                        3rd Qu.:25808   3rd Qu.:25528  
##                                        Max.   :48999   Max.   :48279

This plot shows which states teams thrive and fail in. The Phoenix Suns have an uncanny ability to score from dead balls, which I attribute to their two-time MVP point guard Steve Nash. It’s also clear now that Denver has so many close shot attempts because of their high steal rate.

Multivariate

Teams and attempts by shot type by season

##      season          sbin         off                 pts      
##  Min.   :2005   close  :330   Length:1320        Min.   : 469  
##  1st Qu.:2007   short  :330   Class :character   1st Qu.:1094  
##  Median :2010   mid    :330   Mode  :character   Median :1678  
##  Mean   :2010   3pt    :330                      Mean   :1757  
##  3rd Qu.:2013   no shot:  0                      3rd Qu.:2329  
##  Max.   :2015                                    Max.   :4228  
##       cnt      
##  Min.   : 625  
##  1st Qu.:1242  
##  Median :1768  
##  Mean   :1733  
##  3rd Qu.:2164  
##  Max.   :3300

There are lots of interesting movements here. The Houston Rockets made a conscious effort to avoid mid-range shots and up their 3pt attempts. Daryl Morey is known for his early adoption of analytics, and its effect is apparent. It’s also interesting to see that the Philadelphia Sixers “took” the same approach on their way to becoming a historically awful team.

Teams and points per attempts by shot type by season

The San Antonio Spurs are just good every season, in every respect. It is inconceivable how they are able to keep this up. All of us analysts (myself included) should just leave our computers and go watch the Spurs practice. The Warriors have become the outlier in 3 point efficacy in the most recent season thanks to the Splash Brothers.

Teams and attempts allowed by shot type by season

##      season          sbin         def                 pts      
##  Min.   :2005   close  :330   Length:1321        Min.   :   3  
##  1st Qu.:2007   short  :330   Class :character   1st Qu.:1184  
##  Median :2010   mid    :330   Mode  :character   Median :1686  
##  Mean   :2010   3pt    :331                      Mean   :1756  
##  3rd Qu.:2013   no shot:  0                      3rd Qu.:2228  
##  Max.   :2015                                    Max.   :3588  
##       cnt      
##  Min.   :   1  
##  1st Qu.:1275  
##  Median :1780  
##  Mean   :1732  
##  3rd Qu.:2147  
##  Max.   :3202

This plot speaks volumes about the true colors of a good defense. It is much more about making opponents take unfavorable shots than making them miss their shots. Teams known for their defensive prowess at a given time have all forced opponents to take mid range and short range shots. Orlando during the Dwight Howard days was quite phenomenal in this respect, and it is a shame that the Houston team he is on now is nowhere near that level. It’s also a paradox that Houston, the advocate of the 3 point shot, lives by it offensively but dies by it defensively.

Teams and points per attempts allowed by shot type by season

The general lack of variance here is further support for the previous theory. In the mid range especially, quality of defense doesn’t seem to affect quality of shot. So by getting your opponent to take the mid range shot, you have essentially won the battle.

The following three plots will be generated here to meet criteria but discussed in the final section.

Shot efficacy by distances by state of possession

##     state                sbin         actor                pts        
##  Length:16282       close  :4455   Length:16282       Min.   :   0.0  
##  Class :character   short  :4088   Class :character   1st Qu.:   6.0  
##  Mode  :character   mid    :4206   Mode  :character   Median :  27.0  
##                     3pt    :3533                      Mean   : 142.4  
##                     no shot:   0                      3rd Qu.: 124.0  
##                                                       Max.   :5781.0  
##       cnt              ppa        
##  Min.   :   1.0   Min.   :0.0000  
##  1st Qu.:   6.0   1st Qu.:0.6667  
##  Median :  30.0   Median :0.9091  
##  Mean   : 140.5   Mean   :0.9159  
##  3rd Qu.: 127.0   3rd Qu.:1.1892  
##  Max.   :5954.0   Max.   :4.0000

Shot occurrence by distances by state of possession

##     state                sbin         actor                pct          
##  Length:16384       close  :4455   Length:16384       Min.   :0.005291  
##  Class :character   short  :4088   Class :character   1st Qu.:0.075173  
##  Mode  :character   mid    :4206   Mode  :character   Median :0.203334  
##                     3pt    :3635                      Mean   :0.278137  
##                     no shot:   0                      3rd Qu.:0.436361  
##                                                       Max.   :1.000000

Player defensive contribution

##      item              season             state          
##  Length:203560      Length:203560      Length:203560     
##  Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character  
##     result             value          
##  Length:203560      Length:203560     
##  Class :character   Class :character  
##  Mode  :character   Mode  :character

Final Plots and Summary

Plot One

The final plots were chosen because they most directly answer the question posed: what is the relationship between the state of possession and the shots taken?

  1. Close ranged shots are always better, always. Attempts to get shots 3 feet or closer to the basket should be first priority. 3 point shots are the next best deal.
  2. Dead balls are generally the worst state to score in, because the defense has time to set up. In live ball states the defense is at a disadvantage, as it has less time to make preparations.
  3. Short and mid range shots should be of last resort, unless there are players who exceed the average significantly to warrant it.
  4. Surprisingly, close shots from offensive rebounds are the worst kind of close shots. It is clear from earlier that offensive rebounds had the second highest scoring potency. So how? The next plot should clarify.

Plot Two

Even though it was seen earlier that more favorable states allow shots to be made more easily, the true value of a favorable state may be its ability to land you a close ranged shot. Dead ball states are the worst because it is harder to get a close shot. It is also now understandable why offensive rebounds have high payoffs, simply because it is most likely to end as a close shot, even though they are the close shots that are least likely to be made.
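The argument above amounts to a decomposition: points per shot in a state equal the sum, over shot bins, of the share of attempts in that bin times the points per attempt in that bin. The Python sketch below illustrates how a state can have the weakest close shots and still come out ahead overall. Every number in it is hypothetical and chosen only to show the mechanism; none is taken from the plots.

```python
def points_per_shot(shot_mix, ppa):
    """Sum over shot bins of (share of attempts) * (points per attempt)."""
    if abs(sum(shot_mix.values()) - 1.0) > 1e-9:
        raise ValueError("shot shares must sum to 1")
    return sum(share * ppa[b] for b, share in shot_mix.items())

# Hypothetical dead ball state: few close shots, but they are made often.
dead_ball = points_per_shot(
    {"close": 0.25, "short": 0.15, "mid": 0.35, "3pt": 0.25},
    {"close": 1.25, "short": 0.80, "mid": 0.80, "3pt": 1.05},
)

# Hypothetical offensive rebound state: many close shots, made less often.
off_reb = points_per_shot(
    {"close": 0.60, "short": 0.15, "mid": 0.10, "3pt": 0.15},
    {"close": 1.15, "short": 0.80, "mid": 0.80, "3pt": 1.05},
)
```

With these assumed inputs the offensive rebound state yields more points per shot than the dead ball state, purely because of its shot mix.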

Plot Three: Defensive efficacy against shots by distance type by state of possession

The data for this chart comes from the player coefficients of a ridge regression. This method is used to extract the values referred to in advanced basketball metrics as RAPM, or regularized adjusted plus-minus. The coefficients pertain to the defensive end only, and describe the defensive impact players have on the given outcome. A player who tends to lower opponents’ ability to make a close shot by 2% during a dead ball would be marked as a data point of -2 on fgcp and S1. Boxplots that are flat indicate players generally have no defensive impact on the given outcome. State S0 is introduced here to represent all possessions.
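The report does not show how the regression was set up, so the following is only a minimal Python sketch of a ridge fit under assumed inputs. It assumes each row of X marks which defenders were on the court (1 or 0), y is the outcome for that row expressed relative to league average, and lam is a penalty chosen by the analyst. The function name, the toy data, and the penalty values are all hypothetical.

```python
def ridge_fit(X, y, lam):
    """Solve (X^T X + lam * I) beta = X^T y by Gaussian elimination."""
    n, p = len(X), len(X[0])
    # Augmented matrix [X^T X + lam * I | X^T y].
    M = []
    for j in range(p):
        row = [sum(X[i][j] * X[i][k] for i in range(n)) + (lam if j == k else 0.0)
               for k in range(p)]
        row.append(sum(X[i][j] * y[i] for i in range(n)))
        M.append(row)
    # Forward elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, p):
            f = M[r][col] / M[col][col]
            for c in range(col, p + 1):
                M[r][c] -= f * M[col][c]
    # Back substitution.
    beta = [0.0] * p
    for j in reversed(range(p)):
        s = M[j][p] - sum(M[j][k] * beta[k] for k in range(j + 1, p))
        beta[j] = s / M[j][j]
    return beta

# Hypothetical toy data: 4 observations, 3 defenders.
X = [[1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 0]]
y = [-2.0, -1.0, 1.0, -3.0]  # e.g. opponent close shot FG% minus league average

beta = ridge_fit(X, y, lam=1.0)          # one coefficient per defender
beta_heavy = ridge_fit(X, y, lam=100.0)  # heavier penalty shrinks toward 0
```

The penalty is what makes the coefficients usable when players frequently share the court: it shrinks estimates for players with little or highly collinear data toward zero, which is also why many boxplots can appear flat.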

This chart is enlightening in that it shows where defensive impact on shots is most pronounced, both positively and negatively. Centers, or big men, are valued for their ability to guard around the basket, thus lowering both the rate of close shots and their success. This fact is well known, and is evident here in the large variance in player efficacy in defending close range shots in all possession states. Lesser-known impacts can also be observed in the variance in ability to deter and induce mid range and 3 point shots. These observations can be important additional information regarding a player’s defensive efficacy. Earlier it was established that mid range shots can generally be regarded as the worst shot in basketball. A player’s ability to deter all other outcomes and induce mid range shots is perhaps an overlooked but important aspect of defensive effectiveness.

Reflection

This EDA was able to explore key points that confirm popular wisdom, as well as dig into seldom discussed topics in advanced basketball analytics that appear to have significant impact on the game. Investigation of shot behavior in various possession states gives compelling evidence to generally encourage an offensive strategy of getting more close shots and 3 point shots, and avoiding short and mid range shots. It is important to note that these are general investigations of the league as a whole, and the conclusions cannot be assumed to hold for players with extraordinary abilities. It is common sense to play to your strengths, and to devise strategies that make the most sense for your roster. The importance of understanding the state of possession has yet to be fully explored in this analysis. It can be valuable to understand your own and your opponents’ strengths and weaknesses in various states, and to work to reduce scenarios that play to their strengths and induce scenarios that play to yours.

An extended amount of time was committed to curating raw play-by-play data to obtain the datasets used for this exploratory data analysis. Therefore, not too many obstacles were in the way of conducting the intended analysis. However, there was a lack of numerical and ordinal variables, which limited opportunities to apply correlation methods and draw true scatterplots.